Support DPPL 0.37 #2550

Draft · wants to merge 38 commits into breaking

Conversation

@mhauru (Member) commented May 15, 2025

Currently in a very unfinished state.

@yebai (Member) commented May 18, 2025

Many of the interface code changes here are unnecessary and are planned to be removed in #2413.

Perhaps we should address #2413 first.

@mhauru (Member, Author) commented May 20, 2025

That is tempting, since it would avoid wasting time on fixing code that's on its way out. I worry, though, that removing the samplers will still take a while, and in the meantime all the accumulator work, and the other DPPL changes that build on it, would be held back from Turing.jl. For instance, introducing ValuesAsInModelAccumulator would cut our inference time in #2542 by half.

@penelopeysm (Member) commented Jul 17, 2025

IMO, it goes both ways. Reducing sampler complexity would make this PR easier. On the other hand, merging this PR would also make it easier to remove the duplicate samplers. (To be precise, DPPL 0.37 would make it easier.)

I think DPPL 0.37 has taken a long time and we should prioritise this, rather than trying to squeeze in the changes to samplers.

I'm going to fix the merge conflicts and add a [sources] section to Project.toml that points to the unreleased DPPL branch, so that CI can at least run on 1.11. (See https://pkgdocs.julialang.org/v1/toml-files/#The-%5Bsources%5D-section)

Note that 1.10 CI will always fail, as [sources] isn't understood on Julia 1.10.

test/mcmc/hmc.jl Outdated
Comment on lines 174 to 175
# TODO(mhauru) Do we give up being able to sample from only prior/likelihood like this,
# or do we implement some way to pass `whichlogprob=:LogPrior` through `sample`?

sample(::LogDensityFunction) would solve that, since we could set getlogprior in the LDF, but I guess that'll be the next release.
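A hypothetical sketch of that future API (neither the getter-taking LDF constructor nor a `sample(::LogDensityFunction)` method exists at the time of writing; names here are assumptions, not current DPPL/Turing API):

```julia
# Hypothetical: construct an LDF whose target is the log-prior only,
# then sample from it directly instead of going through `sample(model, ...)`.
ldf = DynamicPPL.LogDensityFunction(model, DynamicPPL.getlogprior)
chain = sample(ldf, NUTS(), 1000)  # would target the prior only
```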

Comment on lines -196 to -201
# TODO(penelopeysm): Can we just use leafcontext(model.context)? Do we
# need to pass in the sampler? (In fact LogDensityFunction defaults to
# using leafcontext(model.context) so could we just remove the argument
# entirely?)
DynamicPPL.SamplingContext(rng, spl, DynamicPPL.leafcontext(model.context));
adtype=spl.alg.adtype,

As established in (e.g.) TuringLang/DynamicPPL.jl#955 (comment), SamplingContext for Hamiltonians was never overloaded, so it is equivalent to just use DefaultContext in the LDF.
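In code, the equivalence is roughly the following (a sketch against the pre-0.37 `LogDensityFunction(model, varinfo, context; adtype)` signature used in the removed lines; `vi` is assumed to be the sampler's VarInfo):

```julia
# SamplingContext was never overloaded for Hamiltonians, so this evaluates
# the same log density as the removed SamplingContext-based construction.
ldf = DynamicPPL.LogDensityFunction(
    model, vi, DynamicPPL.DefaultContext(); adtype=spl.alg.adtype
)
```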

Comment on lines +51 to +54
Exactly like DynamicPPL.LogPriorAccumulator, but does not include the log determinant of the
Jacobian of any variable transformations.

Used for MAP optimisation.
@penelopeysm (Member) commented Jul 19, 2025

I am actually quite confused as to why this is needed (and OptimizationContext before it) because if the VarInfo is linked, calling getlogp on it won't include the logjac. (This is the source of the problem in #2617)

This would only be needed if we wanted to unconditionally ignore logjac even if the VarInfo is unlinked.

@penelopeysm (Member) commented Jul 19, 2025

Oh, I think there's a terminology mismatch here. I must be thinking of logjac of the invlink transform (and that's also what #2617 refers to), whereas this is talking about logjac of the link transform.

i.e. you want to plug in values in linked space and get the invlinked logp as the optimisation target, ignoring the logjac required to transform values to linked space.

whereas the way I've been seeing it is that we can sample in linked space but we need to make sure to include the logjac of the invlink transform so that we get the invlinked logp.
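A standalone numeric illustration of the two conventions, using a LogNormal prior (illustrative only, not Turing's internals):

```julia
using Distributions

d = LogNormal()   # support (0, ∞); link = log, invlink = exp
y = 0.3           # a point in linked space
x = exp(y)        # invlink back to the original space

# MAP target: density of the original parameter, with no Jacobian term.
logp_map = logpdf(d, x)

# Sampling target in linked space: the pushforward density of y = log(x),
# i.e. the same logp plus log|det J_invlink(y)|, which equals y for exp.
logp_sampling = logpdf(d, x) + y
```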

@penelopeysm (Member)

Because Turing re-exports some things that were changed in DPPL 0.37 (for example, LogDensityFunction), this has to go into breaking. (It probably philosophically should anyway, since I think this might be the biggest makeover that DPPL has had in a while.)

@penelopeysm changed the base branch from main to breaking on July 19, 2025, 14:46

Turing.jl documentation for PR #2550 is available at:
https://TuringLang.github.io/Turing.jl/previews/PR2550/

@penelopeysm (Member)

This is getting too unwieldy, unfortunately.

codecov bot commented Jul 20, 2025

Codecov Report

Attention: Patch coverage is 31.61290% with 106 lines in your changes missing coverage. Please review.

Project coverage is 24.79%. Comparing base (465642e) to head (11a2a31).

Files with missing lines | Patch % | Lines
src/mcmc/particle_mcmc.jl | 0.00% | 21 Missing ⚠️
src/optimisation/Optimisation.jl | 69.64% | 17 Missing ⚠️
src/mcmc/mh.jl | 0.00% | 12 Missing ⚠️
ext/TuringOptimExt.jl | 0.00% | 9 Missing ⚠️
src/mcmc/ess.jl | 0.00% | 8 Missing ⚠️
src/mcmc/is.jl | 0.00% | 8 Missing ⚠️
src/mcmc/gibbs.jl | 36.36% | 7 Missing ⚠️
src/mcmc/prior.jl | 0.00% | 5 Missing ⚠️
src/mcmc/sghmc.jl | 0.00% | 5 Missing ⚠️
ext/TuringDynamicHMCExt.jl | 0.00% | 3 Missing ⚠️
... and 5 more

❗ There is a different number of reports uploaded between BASE (465642e) and HEAD (11a2a31): HEAD has 20 fewer uploads than BASE (BASE: 25, HEAD: 5).
Additional details and impacted files
@@              Coverage Diff              @@
##           breaking    #2550       +/-   ##
=============================================
- Coverage     84.78%   24.79%   -60.00%     
=============================================
  Files            22       22               
  Lines          1466     1460        -6     
=============================================
- Hits           1243      362      -881     
- Misses          223     1098      +875     


@mhauru (Member, Author) left a comment

I went through the code and flagged with comments any bits that seemed unfinished.

@penelopeysm and I agreed that this PR is getting unwieldy. All the obvious interface changes should now be done. The parts that don't work are getting logp from samplers, Gibbs, and particle MCMC. We should make individual PRs to fix those, merge them into this one, and then check the comments I left to make sure everything is done. Apart from the issues raised in the comments, I consider the code here ready to merge once tests pass.

@penelopeysm, feel free to likewise flag any parts of the code that we must not forget to attend to before merging. Hopefully we can then avoid another full review of what's here now.

Comment on lines +94 to +95
[sources]
DynamicPPL = {url = "https://github.com/TuringLang/DynamicPPL.jl", rev = "breaking"}

Must not forget to remove this.

Comment on lines +78 to +80
# TODO(DPPL0.37/penelopeysm): This is obviously incorrect. Fix this.
vi = DynamicPPL.setloglikelihood!!(vi, Q.ℓq)
vi = DynamicPPL.setlogprior!!(vi, 0.0)

Must not forget to fix this.

Comment on lines +129 to +135
# TODO(DPPL0.37/penelopeysm) This is obviously wrong. Note that we
# have the same problem here as in HMC in that the sampler doesn't
# tell us about how logp is broken down into prior and likelihood.
# We should probably just re-evaluate unconditionally. A bit
# unfortunate.
DynamicPPL.setlogprior!!(new_varinfo, 0.0)
DynamicPPL.setloglikelihood!!(new_varinfo, new_logp)

Must not forget to fix this.
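For reference, the unconditional re-evaluation mentioned in the TODO could look roughly like this (a sketch only; `model` and `new_varinfo` are as in the diff, and the getters are the DPPL 0.37 accumulator ones):

```julia
# Recompute the prior/likelihood decomposition by re-running the model,
# since the sampler only reports a combined logp.
_, new_varinfo = DynamicPPL.evaluate!!(model, new_varinfo)
logprior = DynamicPPL.getlogprior(new_varinfo)
loglikelihood = DynamicPPL.getloglikelihood(new_varinfo)
```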

Comment on lines +181 to +183
# TODO(mhauru) Fix accumulation here. In this branch anything that gets
# accumulated just gets discarded with `_`.
value, _ = DynamicPPL.tilde_assume(

Must not forget to fix this.

Comment on lines +238 to +245
# Re-evaluate to calculate log probability density.
# TODO(penelopeysm): This seems a little bit wasteful. Unfortunately,
# even though `t.stat.log_density` contains some kind of logp, this
# doesn't track prior and likelihood separately but rather a single
# log-joint (and in linked space), so which we have no way to decompose
# this back into prior and likelihood. I don't immediately see how to
# solve this without re-evaluating the model.
_, vi = DynamicPPL.evaluate!!(model, vi)

Must not forget to attend to this and make a call on whether to pay the cost of re-evaluating or live without a logp.

Comment on lines +323 to +327
# TODO(DPPL0.37/penelopeysm): This is obviously incorrect. We need to
# re-evaluate the model.
set_namedtuple!(vi, trans.params)
return setlogp!!(vi, trans.lp)
vi = DynamicPPL.setloglikelihood!!(vi, trans.lp)
return DynamicPPL.setlogprior!!(vi, 0.0)

Must not forget to fix this.

Comment on lines +358 to +362
# TODO(DPPL0.37/penelopeysm): This is obviously incorrect. We need to
# re-evaluate the model.
vi = DynamicPPL.unflatten(vi, trans.params)
vi = DynamicPPL.setloglikelihood!!(vi, trans.lp)
return DynamicPPL.setlogprior!!(vi, 0.0)

Must not forget to fix this.

Comment on lines +432 to +433
# TODO(DPPL0.37/penelopeysm) The whole tilde pipeline for particle MCMC needs to be
# thoroughly fixed.

Must not forget to fix this.

Comment on lines +81 to +82
[sources]
DynamicPPL = {url = "https://github.com/TuringLang/DynamicPPL.jl", rev = "breaking"}

Must not forget to remove this.

@penelopeysm (Member)

Seems to me that there are several stages:

  • Introduce logjac acc in DPPL
  • Use that here
  • Fix sampler logp issue here
  • Fix Gibbs
  • Fix pMCMC

I'll go off and get started on the logjac acc; it shouldn't be too hard since it's all the same code as before.
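For reference, a minimal self-contained sketch of the idea (names are hypothetical; DPPL's actual accumulator interface may differ):

```julia
# Track log|det J| of the variable transformations separately from the
# log-prior, so downstream code can include or drop it as needed.
struct LogJacobianAccumulator{T<:Real}
    logjac::T
end
LogJacobianAccumulator() = LogJacobianAccumulator(0.0)

# Fold in one variable's Jacobian contribution (hypothetical helper).
accumulate_logjac(acc::LogJacobianAccumulator, logjac_term::Real) =
    LogJacobianAccumulator(acc.logjac + logjac_term)
```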
